CSE 521 Winter 2006 Notes on Hashing
Abstract
The dictionary data structure is ubiquitous in computer science. A dictionary holds a set of items from an ordered universe and supports three operations: inserting an item, deleting an item, and searching for an item (a membership query). This is the dynamic dictionary problem. If the set of items is fixed and never changes, we only need to support membership queries (of the form “x ∈ S?”) efficiently, and this is called the static dictionary problem.

The fundamental “time-space” trade-off in any data structure problem is between the space needed to store the data structure and the time required to perform the operations (such as updates and answering queries). Let us consider this trade-off for the dictionary data structure. Assume the universe U is the set [N] = {0, 1, 2, ..., N − 1}, and we need to store a subset S ⊆ U of n ≪ N items. A trivial solution is to store a bit array of size N that holds the characteristic vector of the set. This supports O(1) time operations but uses Θ(N) bits of storage, which is very wasteful when n ≪ N. At the other extreme, we can store the set as a linked list, which has optimal O(n) storage but requires Θ(n) time per operation in the worst case. A binary search tree, if kept balanced, needs only O(n) storage and O(log n) time per operation. For the static problem, a sorted array supports search in O(log n) time.

The logarithmic time is in fact optimal in the comparison model (where the only allowed operations are comparisons between integers). The objective of hashing is to use other operations on integers and simultaneously achieve optimal O(n) storage as well as O(1) time for the operations.
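To make the two extremes of the trade-off concrete, here is a minimal Python sketch of the bit-array (characteristic vector) dictionary and the sorted-array static dictionary described above. The class and method names (`BitVectorSet`, `SortedArraySet`, `contains`) are hypothetical and chosen only for illustration; they do not come from the notes.

```python
from bisect import bisect_left


class BitVectorSet:
    """Characteristic vector of S over the universe [N] = {0, ..., N-1}.

    Illustrative only: O(1) time per operation, but Theta(N) bits of
    storage regardless of how few items are stored.
    """

    def __init__(self, universe_size):
        self.universe_size = universe_size
        # One bit per universe element, packed eight to a byte.
        self.bits = bytearray((universe_size + 7) // 8)

    def insert(self, x):
        self.bits[x >> 3] |= 1 << (x & 7)

    def delete(self, x):
        # Clear the bit for x; mask to keep the value within one byte.
        self.bits[x >> 3] &= ~(1 << (x & 7)) & 0xFF

    def contains(self, x):
        return bool(self.bits[x >> 3] & (1 << (x & 7)))


class SortedArraySet:
    """Static dictionary: O(n) words of storage, O(log n) search.

    Illustrative only: search uses comparisons alone (binary search).
    """

    def __init__(self, items):
        self.items = sorted(set(items))

    def contains(self, x):
        i = bisect_left(self.items, x)
        return i < len(self.items) and self.items[i] == x
```

With N = 1000 and four stored items, the bit array still occupies 125 bytes, while the sorted array holds just four integers but pays a logarithmic number of comparisons per query. Hashing aims to get the O(1) query time of the first with the O(n) space of the second.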
Publication date: 2006